Encoder

ABSTRACT

A method including generating from a first audio signal, and via a first encoding and decoding of the first audio signal, a second audio signal; determining at least one energy difference value between the first audio signal and the second audio signal; and calculating at least one signal shaping factor dependent on the at least one energy difference value.

FIELD OF THE INVENTION

The present invention relates to coding, and in particular, but not exclusively, to speech or audio coding.

BACKGROUND OF THE INVENTION

Audio signals, like speech or music, are encoded for example for enabling an efficient transmission or storage of the audio signals.

Audio encoders and decoders are used to represent audio based signals, such as music and background noise. These types of coders typically do not utilise a speech model for the coding process, rather they use processes for representing all types of audio signals, including speech.

Speech encoders and decoders (codecs) are usually optimised for speech signals, and can operate at either a fixed or variable bit rate.

An audio codec can also be configured to operate with varying bit rates. At lower bit rates, such an audio codec may work with speech signals at a coding rate equivalent to a pure speech codec. At higher bit rates, the audio codec may code any signal including music, background noise and speech, with higher quality and performance.

In some audio codecs the input signal is divided into a limited number of bands. Each of the band signals may be quantized. From the theory of psychoacoustics it is known that the highest frequencies in the spectrum are usually perceptually less important than the low frequencies. This in some audio codecs is reflected by a bit allocation where fewer bits are allocated to high frequency signals than low frequency signals.

Furthermore some codecs use the correlation between the low and high frequency bands or regions of an audio signal to improve the coding efficiency with the codecs.

As typically the higher frequency bands of the spectrum are generally quite similar to the lower frequency bands some codecs may encode only the lower frequency bands and reproduce the upper frequency bands as a scaled lower frequency band copy. Thus by only using a small amount of additional control information considerable savings can be achieved in the total bit rate of the codec.

Such techniques for coding the high frequency region are known as high frequency region (HFR) coding methods. One form of high frequency region coding is spectral-band-replication (SBR), which has been developed by Coding Technologies. In SBR, a known audio coder, such as a Moving Picture Experts Group MPEG-4 Advanced Audio Coding (AAC) or MPEG-1 Layer III (MP3) coder, codes the low frequency region. The high frequency region is generated separately utilizing the coded low frequency region.

In SBR, the high frequency region is obtained by transposing the low frequency region to the higher frequencies. The transposition is based on a Quadrature Mirror Filter (QMF) bank with 32 bands and is performed such that it is predefined from which band samples each high frequency band sample is constructed. This is done independently of the characteristics of the input signal.

The higher frequency bands are modified based on additional information. The modification is done to make particular features of the synthesized high frequency region more similar to the original one. Additional components, such as sinusoids or noise, are added to the high frequency region to increase the similarity with the original high frequency region. Finally, the envelope is adjusted to follow the envelope of the original high frequency spectrum.

Artefacts known as pre and post echo distortion can arise in transform codecs using perceptual coding rules. Pre-echoes occur when a signal with a sharp attack follows a section of low energy. Pre-echoes occur in such situations because a typical block based transform codec performs quantisation and encoding in the frequency domain. Even when the quantisation satisfies the masking thresholds associated with a perceptual measure criterion, the time-frequency uncertainty dictates that an inverse transformation will spread the quantisation distortion evenly in time throughout the reconstructed block. This results in unmasked distortion throughout the low energy region preceding in time the higher signal region in the decoded signal.

A similar effect can be perceived when there is a sudden offset in the signal. In this case the quantisation noise is spread into the subsequent low energy region after the encoded signal is transformed back into the time domain. This distortion is known as a post echo.

Pre and post echoes may be reduced by selecting a smaller window size in sections of the signal where there are transients. However, in many applications this is not always possible since a fixed delay or transform size is required. Another technique used to reduce the effect of pre/post echo distortion is Temporal Noise Shaping (TNS), whereby an adaptive predictive analysis filter is applied to the coefficients in the frequency domain. This has the effect of shaping the noise in the time domain, thereby concentrating the quantisation noise mostly into the high energy regions of the signal.

These methods have been found to be generally effective for controlling pre and post echo in coding schemes which either code the audio signal as a full band signal (in other words the whole spectrum is encoded by a single method), or in which the audio signal to be decoded contains a large proportion of lower frequency components, such as that found in the low band of an audio coding system employing the split band or SBR approach. However the signal in the upper band of an SBR approach to audio coding can exhibit very different signal characteristics to the corresponding low band and as such the methods do not produce efficient pre and post echo distortion suppression.

SUMMARY OF THE INVENTION

This invention proceeds from the consideration that the previously described methods for controlling pre and post echo are not optimised for the high band signal characteristics in a split band or SBR approach to audio coding.

Embodiments of the present invention aim to address the above problem.

There is provided according to a first aspect of the present invention a method of encoding an audio signal comprising: generating from a first audio signal, and via a first encoding and decoding of the first audio signal, a second audio signal; determining at least one energy difference value between the first audio signal and the second audio signal; and calculating at least one signal shaping factor dependent on the at least one energy difference value.

The method may further comprise partitioning the first audio signal into a plurality of segments.

The segments are preferably at least one of: time segments; frequency segments; time and frequency segments.

Calculating the at least one signal shaping factor may comprise: comparing the at least one energy difference value for at least one of the plurality of segments of the second audio signal against a threshold value; and determining a value of the signal shaping factor associated with the at least one of the plurality of segments dependent on the result of the comparing the at least one energy difference value for at least one of the plurality of segments of the second audio signal against the threshold value.

Determining at least one energy difference value may further comprise determining at least two successive energy difference values for respective at least two successive segments of the first audio signal and at least two successive corresponding segments of the second audio signal.

Calculating at least one signal shaping factor may further comprise comparing the at least two energy difference values against a threshold in order to determine the signal shaping factor for at least one segment of the plurality of segments for the second audio signal.

The method may further comprise generating a signal shaping factor control signal dependent on the signal shaping factor for each of the plurality of segments of the second audio signal.

The energy difference value is preferably dependent on the energy of at least one segment from the first audio signal and the energy of at least one segment from the second audio signal.

The energy difference value is preferably the ratio of the energy of at least one segment of the first audio signal to the energy of at least one segment of the second audio signal.

The first audio signal is preferably an unprocessed audio signal, and the second audio signal is preferably a synthetic audio signal.

The first audio signal and the second audio signal are preferably higher frequency audio signals.

According to a second aspect of the present invention there is provided a method of decoding an audio signal comprising: receiving an encoded signal comprising at least in part a signal shaping factor signal; decoding the encoded signal to produce a synthetic audio signal; determining at least one signal shaping factor for the synthetic signal from the received signal shaping factor signal; and applying the at least one signal shaping factor to the synthetic audio signal.

The method may further comprise partitioning the synthetic audio signal into a plurality of segments.

The segment is preferably at least one of: a time segment; a frequency segment; a time and frequency segment.

The determining at least one signal shaping factor may comprise determining at least one signal shaping factor for each one of the plurality of segments of the synthetic signal.

Applying the at least one signal shaping factor to the synthetic audio signal may comprise applying the at least one signal shaping factor for each one of the plurality of segments to the synthetic audio signal.

Determining the at least one signal shaping factor function may comprise: decoding at least one signal shaping factor from the signal shaping factor signal; adding the at least one signal shaping factor to a track of previous at least one signal shaping factor; interpolating the at least one signal shaping factor with the at least one previous signal shaping factor from the track of signal shaping factors; and interpolating the previous signal shaping factor with the at least one signal shaping factor.

The interpolating is preferably a linear interpolating.

The interpolating is preferably a non-linear interpolating.

According to a third aspect of the invention there is provided an encoder for encoding an audio signal comprising: a first coder-decoder configured to generate from a first audio signal a second audio signal; a signal comparator configured to determine at least one energy difference value between the first audio signal and the second audio signal; and a signal processor configured to calculate at least one signal shaping factor dependent on the at least one energy difference value.

The encoder may further comprise a signal partitioner configured to partition the first audio signal into a plurality of segments.

The segments are preferably at least one of: time segments; frequency segments; time and frequency segments.

The signal processor is preferably further configured to: compare the at least one energy difference value for at least one of the plurality of segments of the second audio signal against a threshold value; and determine a value of the signal shaping factor associated with the at least one of the plurality of segments dependent on the result of the comparison of the at least one energy difference value for at least one of the plurality of segments of the second audio signal against the threshold value.

The signal comparator is preferably configured to determine at least two successive energy difference values for respective at least two successive segments of the first audio signal and at least two successive corresponding segments of the second audio signal.

The signal processor is preferably further configured to compare the at least two energy difference values against a threshold in order to determine the signal shaping factor for at least one segment of the plurality of segments for the second audio signal.

The signal processor is preferably further configured to generate a signal shaping factor control signal dependent on the signal shaping factor for each of the plurality of segments of the second audio signal.

The energy difference value is preferably dependent on the energy of at least one segment from the first audio signal and the energy of at least one segment from the second audio signal.

The energy difference value is preferably the ratio of the energy of at least one segment of the first audio signal to the energy of at least one segment of the second audio signal.

The first audio signal is preferably an unprocessed audio signal, and the second audio signal is preferably a synthetic audio signal.

The first audio signal and the second audio signal are preferably higher frequency audio signals.

According to a fourth aspect of the present invention there is provided a decoder for decoding an audio signal configured to: receive an encoded signal comprising at least in part a signal shaping factor signal; decode the encoded signal to produce a synthetic audio signal; determine at least one signal shaping factor for the synthetic signal from the received signal shaping factor signal; and apply the at least one signal shaping factor to the synthetic audio signal.

The decoder may be further configured to partition the synthetic audio signal into a plurality of segments.

The segments are at least one of: time segments; frequency segments; time and frequency segments.

The decoder is preferably configured to determine the at least one signal shaping factor by determining at least one signal shaping factor for each one of the plurality of segments of the synthetic signal.

The decoder is preferably configured to apply the at least one signal shaping factor to the synthetic audio signal by applying the at least one signal shaping factor for each one of the plurality of segments to the synthetic audio signal.

The decoder is preferably configured to determine the at least one signal shaping factor function by: decoding at least one signal shaping factor from the signal shaping factor signal; adding the at least one signal shaping factor to a track of previous at least one signal shaping factor; interpolating the at least one signal shaping factor with the at least one previous signal shaping factor from the track of signal shaping factors; and interpolating the previous signal shaping factor with the at least one signal shaping factor.

The interpolating is preferably a linear interpolation.

The interpolating is preferably a non-linear interpolation.

Apparatus may comprise an encoder as described above.

Apparatus may comprise a decoder as described above.

An electronic device may comprise an encoder as described above.

An electronic device may comprise a decoder as described above.

According to a fifth aspect of the present invention there is provided a computer program product configured to perform a method for encoding an audio signal comprising: generating from a first audio signal, and via a first encoding and decoding of the first audio signal, a second audio signal; determining at least one energy difference value between the first audio signal and the second audio signal; and calculating at least one signal shaping factor dependent on the at least one energy difference value.

According to a sixth aspect of the present invention there is provided a computer program product configured to perform a method for decoding an audio signal comprising: receiving an encoded signal comprising at least in part a signal shaping factor signal; decoding the encoded signal to produce a synthetic audio signal; determining at least one signal shaping factor for the synthetic signal from the received signal shaping factor signal; and applying the at least one signal shaping factor to the synthetic audio signal.

According to a seventh aspect of the present invention there is provided an encoder for encoding an audio signal comprising: codec means for generating from a first audio signal a second audio signal; first signal processing means configured to determine at least one energy difference value between the first audio signal and the second audio signal; and second signal processing means configured to calculate at least one signal shaping factor dependent on the at least one energy difference value.

According to an eighth aspect of the present invention there is provided a decoder for decoding an audio signal, comprising: receiving means for accepting an encoded signal comprising at least in part a signal shaping factor signal; decoding means for decoding the encoded signal to produce a synthetic audio signal; first signal processing means for determining at least one signal shaping factor for the synthetic signal from the received signal shaping factor signal; and second signal processing means for applying the at least one signal shaping factor to the synthetic audio signal.

BRIEF DESCRIPTION OF DRAWINGS

For better understanding of the present invention, reference will now be made by way of example to the accompanying drawings in which:

FIG. 1 shows schematically an electronic device employing embodiments of the invention;

FIG. 2 shows schematically an audio codec system employing embodiments of the present invention;

FIG. 3 shows schematically an encoder part of the audio codec system shown in FIG. 2;

FIG. 4 shows schematically a decoder part of the audio codec system shown in FIG. 2;

FIG. 5 shows an example of gain track interpolation as employed in embodiments of the invention;

FIG. 6 shows a flow diagram illustrating the operation of an embodiment of the audio encoder as shown in FIG. 3 according to the present invention; and

FIG. 7 shows a flow diagram illustrating the operation of an embodiment of the audio decoder as shown in FIG. 4 according to the present invention.

DESCRIPTION OF PREFERRED EMBODIMENTS OF THE INVENTION

The following describes in more detail possible apparatus and mechanisms for the provision of pre and post echo control in a high band signal component of an audio codec. In this regard reference is first made to FIG. 1, which shows a schematic block diagram of an exemplary electronic device 10, which may incorporate a codec according to an embodiment of the invention.

The electronic device 10 may for example be a mobile terminal or user equipment of a wireless communication system.

The electronic device 10 comprises a microphone 11, which is linked via an analogue-to-digital converter 14 to a processor 21. The processor 21 is further linked via a digital-to-analogue converter 32 to loudspeakers 33. The processor 21 is further linked to a transceiver (TX/RX) 13, to a user interface (UI) 15 and to a memory 22.

The processor 21 may be configured to execute various program codes. The implemented program codes 23 comprise an audio encoding code for encoding a lower frequency band of an audio signal and a higher frequency band of an audio signal. The implemented program codes 23 further comprise an audio decoding code. The implemented program codes 23 may be stored for example in the memory 22 for retrieval by the processor 21 whenever needed. The memory 22 could further provide a section 24 for storing data, for example data that has been encoded in accordance with the invention.

The encoding and decoding code may in embodiments of the invention be implemented in hardware or firmware.

The user interface 15 enables a user to input commands to the electronic device 10, for example via a keypad, and/or to obtain information from the electronic device 10, for example via a display. The transceiver 13 enables a communication with other electronic devices, for example via a wireless communication network.

It is to be understood again that the structure of the electronic device 10 could be supplemented and varied in many ways.

A user of the electronic device 10 may use the microphone 11 for inputting speech that is to be transmitted to some other electronic device or that is to be stored in the data section 24 of the memory 22. A corresponding application has been activated to this end by the user via the user interface 15. This application, which may be run by the processor 21, causes the processor 21 to execute the encoding code stored in the memory 22.

The analogue-to-digital converter 14 converts the input analogue audio signal into a digital audio signal and provides the digital audio signal to the processor 21.

The processor 21 may then process the digital audio signal in the same way as described with reference to FIGS. 2 and 3.

The resulting bit stream is provided to the transceiver 13 for transmission to another electronic device. Alternatively, the coded data could be stored in the data section 24 of the memory 22, for instance for a later transmission or for a later presentation by the same electronic device 10.

The electronic device 10 could also receive a bit stream with correspondingly encoded data from another electronic device via its transceiver 13. In this case, the processor 21 may execute the decoding program code stored in the memory 22. The processor 21 decodes the received data, and provides the decoded data to the digital-to-analogue converter 32. The digital-to-analogue converter 32 converts the digital decoded data into analogue audio data and outputs them via the loudspeakers 33. Execution of the decoding program code could be triggered as well by an application that has been called by the user via the user interface 15.

The received encoded data could also be stored instead of an immediate presentation via the loudspeakers 33 in the data section 24 of the memory 22, for instance for enabling a later presentation or a forwarding to still another electronic device.

It would be appreciated that the schematic structures described in FIGS. 2 to 4 and the method steps in FIGS. 6 and 7 represent only a part of the operation of a complete audio codec as exemplarily shown implemented in the electronic device shown in FIG. 1.

The general operation of audio codecs as employed by embodiments of the invention is shown in FIG. 2. General audio coding/decoding systems consist of an encoder and a decoder, as illustrated schematically in FIG. 2. Illustrated is a system 102 with an encoder 104, a storage or media channel 106 and a decoder 108.

The encoder 104 compresses an input audio signal 110 producing a bit stream 112, which is either stored or transmitted through a media channel 106. The bit stream 112 can be received within the decoder 108. The decoder 108 decompresses the bit stream 112 and produces an output audio signal 114. The bit rate of the bit stream 112 and the quality of the output audio signal 114 in relation to the input signal 110 are the main features, which define the performance of the coding system 102.

FIG. 3 shows schematically an encoder 104 according to an embodiment of the invention. The encoder 104 comprises an input 203 arranged to receive an audio signal.

The input 203 is connected to a band splitter 230, which divides the signal into an upper frequency band (also known as a higher frequency region) and a lower frequency band (also known as a lower frequency region). The lower frequency band output from the band splitter is connected to the lower frequency region coder (otherwise known as the core codec) 231. The lower frequency region coder 231 is further connected to the higher frequency region coder 232 and is configured to pass information about the coding of the lower frequency region for the higher frequency region coding process.

The higher frequency band output from the band splitter is arranged to be connected to the higher frequency region (HFR) coder 232. The HFR coder 232 is configured to output a synthetic audio signal which is arranged to be connected to the input of the pre/post echo control processor 233.

In addition to receiving an input from the HFR coder 232, the pre/post echo control processor 233 is further arranged to receive, as an additional input, the original higher frequency band signal as outputted from the band splitter 230.

The lower frequency region (LFR) coder 231, the HFR coder 232 and the pre/post echo control processor 233 are configured to output signals to the bitstream formatter 234 (which in some embodiments of the invention is also known as the bitstream multiplexer). The bitstream formatter 234 is configured to output the output bitstream 112 via the output 205.

The operation of these components is described in more detail with reference to the flow chart shown in FIG. 6 showing the operation of the encoder 104.

The audio signal is received by the encoder 104. In a first embodiment of the invention the audio signal is a digitally sampled signal. In other embodiments of the present invention the audio input may be an analogue audio signal, for example from a microphone 6, which is analogue-to-digital (A/D) converted. In further embodiments of the invention the audio input is converted from a pulse code modulation digital signal to an amplitude modulation digital signal. The receiving of the audio signal is shown in FIG. 6 by step 601.

The band splitter 230 receives the audio signal and divides the signal into a higher frequency band signal and a lower frequency band signal. In some embodiments of the present invention the dividing of the audio signal into higher frequency and lower frequency band signals may take the form of low pass filtering (to produce the lower frequency band signal) and high pass filtering (to produce the higher frequency band signal) of the audio signal in order to effectuate the division of the signal into bands.

Typically the process may be followed by a down sampling stage of the respective filtered signals in order to achieve two base band signals. For example, a down sampling factor of two may be used in order to achieve two base band signals of equal bandwidth.

In further embodiments of the present invention the splitting of the signal may be effectuated by utilising a quadrature mirror filter (QMF) structure whereby the aliasing components introduced by the analysis filtering stage are effectively cancelled by each other when the signal is reconstructed at the synthesis stage in the decoder.
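The band splitting and alias cancellation described above can be illustrated with the shortest possible QMF pair. The following is a minimal sketch only: the two-tap (Haar) filters and the function names `haar_analysis` and `haar_synthesis` are assumptions made for illustration and are not taken from the embodiments, which would in practice use much longer filters.

```python
import math

# Illustrative two-band QMF split using two-tap (Haar) filters.
# The filter choice and function names are assumptions for this sketch.
_S = 1.0 / math.sqrt(2.0)


def haar_analysis(x):
    """Split an even-length signal into low and high band signals.

    Each output is down sampled by a factor of two, so both bands
    contain len(x) // 2 samples.
    """
    if len(x) % 2 != 0:
        raise ValueError("input length must be even")
    low = [_S * (x[2 * i] + x[2 * i + 1]) for i in range(len(x) // 2)]
    high = [_S * (x[2 * i] - x[2 * i + 1]) for i in range(len(x) // 2)]
    return low, high


def haar_synthesis(low, high):
    """Recombine the two bands; the aliasing terms cancel each other."""
    out = []
    for lo, hi in zip(low, high):
        out.append(_S * (lo + hi))  # reconstructs the even-indexed sample
        out.append(_S * (lo - hi))  # reconstructs the odd-indexed sample
    return out


signal = [1.0, 2.0, 3.0, 4.0, -1.0, 0.5]
low_band, high_band = haar_analysis(signal)
reconstructed = haar_synthesis(low_band, high_band)
```

With no quantisation between analysis and synthesis, `reconstructed` matches `signal` up to floating point rounding, which is the alias cancellation property referred to in the text.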

This division of the signal into higher frequency and lower frequency band signals is shown in FIG. 6, by step 603.

The lower frequency region (LFR) coder 231 as described above receives the lower frequency band (and optionally down sampled) audio signal and applies a suitable low frequency coding upon the signal. In an embodiment of the invention the lower frequency region coder 231 may apply quantisation and Huffman coding to sub-bands of the lower frequency region audio signal. The input signal 110 to the lower frequency region coder 231 may in these embodiments be divided into sub-bands using an analysis filter bank structure. Each sub-band may be quantized and coded utilizing the information provided by a psychoacoustic model. The quantisation settings as well as the coding scheme may be chosen dependent on the psychoacoustic model applied.

The quantised, coded information is sent to the bit stream formatter 234 for creating a bit stream 112.

Furthermore, the low frequency coder 231 provides a frequency domain realization of the synthesized LFR signal. This realization may be passed to the HFR coder 232, in order to effectuate the coding of the higher frequency region.

This lower frequency coding is shown in FIG. 6 by step 606.

In other embodiments of the invention other low frequency codecs may be employed in order to generate the core coding output which is output to the bitstream formatter 234. Examples of these further embodiment low frequency codecs include but are not limited to advanced audio coding (AAC), MPEG layer 3 (MP3), the ITU-T Embedded variable bit rate (EV-VBR) speech coding baseline codec, and ITU-T G.729.1.

The higher frequency band signal output from the band splitter 230 may then be received by the high frequency region (HFR) coder 232. In a first embodiment of the present invention this higher frequency band signal may be encoded with a spectral band replication type algorithm, where spectral information from the coding of the lower frequency band is used to replicate the higher frequency band spectral structure. In further embodiments of the present invention this higher frequency band signal may be encoded with a higher frequency region coder that may solely act on the higher frequency band signal to be encoded and does not employ information from the lower frequency band to assist in the process.

This high frequency region coding stage is exemplarily depicted by step 607 in FIG. 6.

As part of the higher frequency band encoding process, the codec may produce a synthetic audio signal output. This is a representation or estimation of the decoded signal but produced locally at the encoder. In an exemplary embodiment of the present invention this higher frequency band synthetic signal may be divided into segments along with the original higher frequency band signal. The length of the segment may be arbitrarily chosen, but typically it will be related to the sampling frequency of the signal. This segmentation of the original and synthetic signals is depicted by step 609 in FIG. 6.

The pre/post echo control processor 233 may determine an energy value of each segment for the synthetic and original higher frequency band signals. This stage is represented in FIG. 6 by step 611.

Furthermore the pre/post echo control processor 233 may determine a measure of the relative difference in energy between corresponding segments of the synthetic and original signals using the determined energy values of each segment for the synthetic and original higher frequency band signals. This determination of the measure of the relative difference in energy stage is represented in FIG. 6 by step 613.

The pre/post echo control processor 233 may also in embodiments of the invention track the determined measure of relative difference in the energy for the synthetic and original higher frequency band signals across successive segments and compare the determined measure against a predetermined threshold value in order to ascertain if there is a discrepancy between the original and synthetic signals due to pre or post echo. This tracking process is shown in FIG. 6 by step 617.

The pre/post echo control processor 233 may then pass information regarding the comparison of the energy difference against the threshold value for each segment to the bit stream formatter 234. This is shown in FIG. 6 by step 619.

The bitstream formatter 234 receives the low frequency coder 231 output, the high frequency region coder 232 output and the selection output from the pre/post echo control processor 233 and formats the bitstream to produce the bitstream output. The bitstream formatter 234 in some embodiments of the invention may interleave the received inputs and may generate error detecting and error correcting codes to be inserted into the bitstream output 112.

In the embodiment described hereafter the encoding of the present invention is exemplarily described with respect to a specific example, but it is to be understood that this example is not limiting and is included to enhance the understanding of the invention.

In this exemplary embodiment of the encoder of the present invention, $x_{orig}(n)$ is the original higher frequency band signal, and $x_{syn}(n)$ is the locally generated higher frequency band synthesized version. Initially both signals may be divided into segments of length N samples. For example a suitable length of segment was found to be 2.5 ms, which for a 32 kHz sampled signal results in an analysis frame length of 80 samples. However, it is to be understood that other embodiments of the present invention may implement the invention with segments of different length.
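The segment length arithmetic and the segmentation step can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the decision to discard a trailing partial segment is an assumption that the text does not address.

```python
def segment_length(sample_rate_hz, duration_ms):
    """Number of samples N in a segment of the given duration."""
    return int(round(sample_rate_hz * duration_ms / 1000.0))


def split_into_segments(x, n):
    """Divide x into consecutive segments of n samples each.

    Assumption for this sketch: a trailing partial segment is dropped.
    """
    return [x[i:i + n] for i in range(0, len(x) - n + 1, n)]


# 2.5 ms at a 32 kHz sampling rate gives the 80 sample frame from the text.
N = segment_length(32000, 2.5)

# The same segmentation is applied to the original and synthetic signals
# so that the k-th segments of both cover the same time span.
x_orig = [0.0] * 200
segments_orig = split_into_segments(x_orig, N)
```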

In this example the k′th segment of the original and synthesized signalsare denoted as x_(orig) ^(k)(n) and x_(syn) ^(k)(n), where nε0, . . . ,N−1, respectively.

Furthermore the pre/post echo control processor 233 may determine an energy value of each segment for the synthetic and original higher frequency band signals as the mean square value of the samples in the segment. Thus

$E_{orig}^{k} = \frac{1}{N}\sum_{n=0}^{N-1}\left(x_{orig}^{k}(n)\right)^{2}, \qquad E_{syn}^{k} = \frac{1}{N}\sum_{n=0}^{N-1}\left(x_{syn}^{k}(n)\right)^{2},$

where E_(orig)^(k) is the energy of the k′th segment of the original higher frequency band signal and E_(syn)^(k) is the energy of the k′th segment of the synthetic higher frequency band signal. However, it is to be understood that further embodiments of the present invention may use different energy measures, for example, and without limitation, the root mean square (RMS) value or the mean of the magnitude of the band signal.
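As an illustration of the mean square energy measure, a minimal sketch is given below. The function name segment_energy is a hypothetical name chosen for this example.

```python
def segment_energy(segment):
    """Mean square value of one segment: (1/N) * sum of x(n)^2 over the segment."""
    return sum(x * x for x in segment) / len(segment)


# Example: an 80-sample segment of constant amplitude 0.5
e_syn = segment_energy([0.5] * 80)
```

The same function would be applied to corresponding segments of both the original and the synthetic signal.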

The pre/post echo control processor 233 may determine the relative difference in energy between corresponding segments of the synthetic and original signals by determining the ratio of the respective energies. Thus in this example the relative difference metric d^(k) for the k′th segment is given by:

$d^{k} = \frac{E_{orig}^{k}}{E_{syn}^{k}}.$

It is to be understood, however, that other difference energy metrics may be employed in further embodiments of the present invention. For example some embodiments may implement the difference energy metric as a simple difference, such as the difference of the magnitude of the energies.
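The ratio metric can be sketched as below. The guard against a zero synthetic energy (the eps parameter) is an assumption added so that the sketch does not divide by zero; the description does not state how that case is handled.

```python
def energy_ratio(e_orig, e_syn, eps=1e-12):
    """Difference energy metric d^k = E_orig^k / E_syn^k.

    Assumption: e_syn is floored at eps to avoid division by zero.
    """
    return e_orig / max(e_syn, eps)


# A synthetic segment carrying five times the energy of the original,
# as might happen when a pre echo is present
d_k = energy_ratio(0.1, 0.5)
```

A small value of d^(k) indicates that the synthetic segment has considerably more energy than the original segment.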

The pre/post echo control processor 233 may then track the difference energy metric d^(k) across segments and define a logarithmic domain gain parameter g^(k) dependent on the segment difference energy metric with respect to the predefined difference energy threshold d̂, based on the energy ratios in two successive segments. The logic presented in Table 1 may then be used in the determination of g^(k).

Table 1 exemplarily depicts a pseudo code logic for obtaining gain values g^(k) in an embodiment of the present invention.

For every k:
  If (d^(k) < d̂ and d^(k−1) < d̂) {
    g^(k) = ĝ
    g^(k−1) = ĝ
  } else {
    g^(k) = 0
  }

Typically for embodiments of the invention d̂ and ĝ are experimentally chosen values. Also ĝ may, in some embodiments of the invention, be selected to be a negative value. It is to be noted, in this embodiment of the invention, that if both the current energy difference metric d^(k) and the previous energy difference metric d^(k−1) are below d̂, then the value of the gain parameter of the previous segment, g^(k−1), is also modified.
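The logic of Table 1 may be sketched in Python as follows. The function name gain_parameters is hypothetical, and the treatment of the very first segment (which has no predecessor in the list and is therefore left at 0 unless the following segment modifies it) is an assumption not stated in the description.

```python
def gain_parameters(d, d_hat, g_hat):
    """Derive the gain parameters g^k from the difference energy metrics d^k.

    d     : list of per-segment metrics d^k
    d_hat : threshold value
    g_hat : gain value used when two successive metrics fall below d_hat
    Assumption: segment 0 has no predecessor and starts at gain 0.
    """
    g = [0.0] * len(d)
    for k in range(1, len(d)):
        if d[k] < d_hat and d[k - 1] < d_hat:
            g[k] = g_hat
            g[k - 1] = g_hat   # the previous segment is modified as well
        else:
            g[k] = 0.0
    return g


# Segments 1 and 2 are both below the threshold, so both receive g_hat
gains = gain_parameters([1.0, 0.1, 0.15, 0.9], d_hat=0.2, g_hat=-0.5)
```

A single segment below the threshold, without a neighbour that is also below it, leaves the gain at 0 in this sketch.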

In this particular embodiment of the invention g^(k) may only be one of two values. Thus in this example only one bit may be submitted to the decoder in order to describe the value of g^(k) in a segment k.
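Because g^(k) takes one of two values, the per-segment information can be carried as one flag bit per segment. The sketch below shows one possible packing; the function names and the most-significant-bit-first ordering are assumptions made for illustration, since the description does not specify a bitstream layout.

```python
def pack_gain_flags(g, g_hat):
    """Pack one bit per segment: 1 if the segment gain equals g_hat, else 0.

    Assumption: the first segment is placed in the most significant bit.
    """
    bits = 0
    for value in g:
        bits = (bits << 1) | (1 if value == g_hat else 0)
    return bits


def unpack_gain_flags(bits, count, g_hat):
    """Recover the per-segment gains from the packed flag bits."""
    return [g_hat if (bits >> (count - 1 - i)) & 1 else 0.0 for i in range(count)]


g = [0.0, -0.5, -0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
packed = pack_gain_flags(g, g_hat=-0.5)
restored = unpack_gain_flags(packed, count=len(g), g_hat=-0.5)
```

Here eight segment flags fit into a single byte, and unpacking restores the original list of gains.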

An advantage of embodiments of the invention such as described above is thus that this improvement requires only a very low additional bit rate over previous methods of controlling pre and post echo.

To further assist the understanding of the invention the operation of the decoder 108 with respect to the embodiments of the invention is shown with respect to the decoder schematically shown in FIG. 4 and the flow chart showing an example of the operation of the decoder in FIG. 7.

The decoder comprises an input 313 from which the encoded bitstream 112 may be received. The input 313 is connected to the bitstream unpacker 301.

The bitstream unpacker demultiplexes, partitions, or unpacks the encoded bitstream 112 into three separate bitstreams. The lower frequency region encoded bitstream is passed to the lower frequency region decoder 303, the higher frequency region encoded bitstream is passed to the higher frequency region reconstructor/decoder 307 (also known as a high frequency region decoder) and the echo control bitstream is passed to the echo control signal modification processor 305.

This unpacking process is shown in FIG. 7 by step 701.

The lower frequency region decoder 303 receives the lower frequency region encoded data and constructs a synthesized lower frequency signal by performing the inverse process to that performed in the lower frequency region coder 231. If the higher frequency region codec employs an SBR type algorithm then this synthesized lower frequency region signal may be passed to the higher frequency region decoder/reconstructor 307. In addition the synthetic output of the lower frequency region decoder may be further arranged to form one of the inputs to the band combiner/synthesis filter 309.

This lower frequency region decoding process is shown in FIG. 7 by step 707.

The higher frequency region decoder or reconstructor 307, on receiving the higher frequency region encoded data, constructs a synthesised high frequency signal by performing the inverse process to that performed in the higher frequency region coder 232.

The higher frequency region construction or decoding is shown in FIG. 7 by step 705.

The output of the higher frequency region decoder is then arranged to be passed to the pre/post echo control signal modification unit 305. On receiving the higher frequency region synthesised signal, the echo signal modification unit will parse the echo control bitstream, and for each corresponding segment of the synthesised signal determine if the time envelope of the segment requires modification by a gain factor.

In addition, in some embodiments of the invention interpolation may be applied to the gain factor across the length of the segment, if the signal modification gain is deemed to change at the boundaries of the said segment. The variable gain function, as well as the previously described gain, may also be known as a signal shaping function as it produces a signal shaping effect. The signal shaping function when applied may have the effect of smoothing out any energy transitions in the time envelope window from one segment to the next. In some embodiments of the present invention it may be necessary to monitor the signal modification gain track from one segment to the next in order to determine the exact signal shaping function to be applied across the segment.

The process of determining if a particular segment requires echo control modification is depicted by step 703 in FIG. 7. The mechanism of deploying signal modification to the synthesised higher frequency region signal is further depicted by step 709 in FIG. 7.

The signal reconstruction processor 309 receives the decoded lower frequency region signal and the decoded or reconstructed higher frequency region signal, and forms a full band or spectral signal by using the inverse of the process used to split the signal spectrum into two bands or regions at the encoder, as exemplarily depicted by 230. In some embodiments of the present invention this may be achieved by using a synthesis filter bank structure if the equivalent analysis bank is employed at the encoder. An example of such an analysis synthesis filter bank structure may be a QMF filter bank.

This reconstruction of the signal into a full band signal is shown in FIG. 7 by step 711.

In an example of an embodiment of the present invention the gain parameters g^(k) may be arranged to form a gain track g(n) (a signal shaping factor) at the decoder. If the gain/signal shaping factor value is seen to change at the segment boundaries, linear interpolation may be used in order to smooth out the gain transition as the segment is traversed. This is exemplarily depicted in FIG. 5. In the example shown in FIG. 5 a gain track g(n) 551 is shown for a series of consecutive segments. Four segments are shown: the k−2 segment 501, the k−1 segment 503, the k segment 505 and the k+1 segment 507. In the example shown the k−2 segment 501 has a signal shaping factor of 0 and the k segment 505 has a signal shaping factor of 0. In order that there is a gradual change from the k−2 segment to the k segment, the intermediate k−1 segment 503 applies a linear transform of the gain function to each of its samples. In other words the first sample in the k−1 segment 503 has a value near the value of the k−2 segment last sample 511, and the k−1 segment last sample 513 has a value near to the value of the first sample of the k segment 505.

It is to be understood that in further embodiments of the invention different interpolation schemes may be adopted. For example, it may be possible to adopt a non linear scheme.
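One possible way of forming a gain track with linear interpolation is sketched below. The description leaves the exact placement of the ramp open; this sketch assumes that, whenever the gain of a segment differs from that of the preceding segment, the gain ramps linearly across that segment and reaches the new value at its last sample. The function name gain_track is hypothetical.

```python
def gain_track(g, segment_len):
    """Expand per-segment gains g^k into a per-sample gain track g(n).

    Assumption: when the gain changes between two segments, the later
    segment ramps linearly from the previous gain to its own gain.
    """
    track = []
    if not g:
        return track
    prev = g[0]
    for gk in g:
        if gk == prev:
            track.extend([gk] * segment_len)
        else:
            for n in range(segment_len):
                # Linear ramp that reaches gk at the last sample of the segment
                track.append(prev + (gk - prev) * (n + 1) / segment_len)
        prev = gk
    return track


# Two segments of 4 samples each: gain 0 followed by gain -0.5
track = gain_track([0.0, -0.5], segment_len=4)
```

A non linear scheme could be substituted by replacing the ramp expression inside the inner loop.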

The synthesized signal x_(syn)(n) may then be modified by using the gain track/signal shaping factor g(n). Should a logarithmic gain parameter be used, then the higher frequency region synthetic signal may be modified as follows:

$\hat{x}_{syn}(n) = x_{syn}(n) \cdot 10^{g(n)},$

where x̂_(syn)(n) is the modified synthesized signal. Further, in this exemplary embodiment of the invention it may be noted that when g(n) is zero, there is no energy difference between original and synthesized signals, and x̂_(syn)(n) is equal to x_(syn)(n).
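The sample-wise modification by the logarithmic gain track can be sketched as follows. The function name apply_gain_track is hypothetical, and the sketch assumes that x_syn and g have the same length.

```python
def apply_gain_track(x_syn, g):
    """Scale each synthesized sample by 10 raised to the gain track value g(n)."""
    return [x * 10.0 ** gn for x, gn in zip(x_syn, g)]


unchanged = apply_gain_track([1.0, -2.0], [0.0, 0.0])   # g(n) = 0 leaves samples as they are
attenuated = apply_gain_track([1.0], [-0.5])             # scales by 10^(-0.5), roughly 0.316
```

With a negative gain value the synthesized samples are attenuated, which is consistent with ĝ being selected as a negative value.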

In one embodiment of the invention the temporal envelope shaping technique may be used to control the pre and post echo for a higher frequency region synthesised signal for frequencies within the region of 7 kHz to 14 kHz, where the overall sampling frequency of the codec is 32 kHz. For this particular example, the higher frequency region codec utilises a frame size of 20 ms or 640 samples. The frame may be divided into 8 segments where each segment may be a length of 80 samples. At the encoder the fixed values may be selected to be:

d̂=0.2

ĝ=−0.5

Since there are 8 segments per frame, 8 bits may be used in order to represent the echo control information for the frame. For this particular example of an embodiment of the present invention the echo control information would only result in an overhead of 0.4 kbits/sec.
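The stated overhead follows directly from the frame parameters of this example, as the short calculation below illustrates. The variable names are chosen for this sketch only.

```python
SAMPLE_RATE_HZ = 32000
FRAME_SAMPLES = 640
SEGMENTS_PER_FRAME = 8
BITS_PER_SEGMENT = 1

# Frame duration: 640 samples at 32 kHz
frame_ms = 1000 * FRAME_SAMPLES / SAMPLE_RATE_HZ

# Samples per segment
segment_samples = FRAME_SAMPLES // SEGMENTS_PER_FRAME

# Echo control side information rate in kbit/s
frames_per_second = 1000 / frame_ms
overhead_kbps = SEGMENTS_PER_FRAME * BITS_PER_SEGMENT * frames_per_second / 1000
```

This gives a 20 ms frame, 80 samples per segment, and 8 bits every 20 ms, that is 400 bits per second or 0.4 kbit/s.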

One advantage of this invention is that it provides an efficient, low complexity and low bit rate solution to the problem of echo control temporal envelope shaping. The method was found to be especially suitable for those audio codec architectures which deploy high band coding at a frequency range greater than 7 kHz.

Although the above embodiments have been described in terms of a split frequency region/band architecture whereby the signal has been divided into a higher frequency region and a lower frequency region, it is to be understood that further embodiments of the present invention may be deployed with different numbers of split frequency regions in different coding architectures.

For example each of the lower and higher frequency regions may be further subdivided into sub-regions or sub-bands and a lower frequency sub-band associated with a higher frequency sub-band. In such embodiments of the invention the associated sub-bands are compared and the gain factor/shaping factors are determined for each sub-band of each segment. Although this further division increases the information having to be passed from the encoder to the decoder it results in signal shaping factors being targeted to assist in the reduction of echo errors.

In further embodiments of the invention it may be possible to examine each signal segment across the full band of the signal thereby removing the need for a mechanism to divide the signal into multiple bands. This for example may be further advantageous if the signal characteristics exhibit features which may typically be found in a high band. One example of these features may occur if the signal is unstructured and noise like, such as that found in an unvoiced sound.

The embodiments of the invention described above describe the codec in terms of separate encoder 104 and decoder 108 apparatus in order to assist the understanding of the processes involved. However, it would be appreciated that the apparatus, structures and operations may be implemented as a single encoder-decoder apparatus/structure/operation. Furthermore in some embodiments of the invention the coder and decoder may share some or all common elements.

Although the above examples describe embodiments of the invention operating within a codec within an electronic device 610, it would be appreciated that the invention as described below may be implemented as part of any variable rate/adaptive rate audio (or speech) codec. Thus, for example, embodiments of the invention may be implemented in an audio codec which may implement audio coding over fixed or wired communication paths.

Thus user equipment may comprise an audio codec such as those described in embodiments of the invention above.

It shall be appreciated that the term user equipment is intended to cover any suitable type of wireless user equipment, such as mobile telephones, portable data processing devices or portable web browsers.

Furthermore elements of a public land mobile network (PLMN) may also comprise audio codecs as described above.

In general, the various embodiments of the invention may be implemented in hardware or special purpose circuits, software, logic or any combination thereof. For example, some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the invention is not limited thereto. While various aspects of the invention may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.

The embodiments of this invention may be implemented by computer software executable by a data processor of the mobile device, such as in the processor entity, or by hardware, or by a combination of software and hardware. Further in this regard it should be noted that any blocks of the logic flow as in the Figures may represent program steps, or interconnected logic circuits, blocks and functions, or a combination of program steps and logic circuits, blocks and functions.

The memory may be of any type suitable to the local technical environment and may be implemented using any suitable data storage technology, such as semiconductor-based memory devices, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory. The data processors may be of any type suitable to the local technical environment, and may include one or more of general purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs) and processors based on multi-core processor architecture, as non-limiting examples.

Embodiments of the inventions may be practiced in various components such as integrated circuit modules. The design of integrated circuits is by and large a highly automated process. Complex and powerful software tools are available for converting a logic level design into a semiconductor circuit design ready to be etched and formed on a semiconductor substrate.

Programs, such as those provided by Synopsys, Inc. of Mountain View, Calif. and Cadence Design, of San Jose, Calif. automatically route conductors and locate components on a semiconductor chip using well established rules of design as well as libraries of pre-stored design modules. Once the design for a semiconductor circuit has been completed, the resultant design, in a standardized electronic format (e.g., Opus, GDSII, or the like) may be transmitted to a semiconductor fabrication facility or “fab” for fabrication.

The foregoing description has provided by way of exemplary and non-limiting examples a full and informative description of the exemplary embodiment of this invention. Various modifications and adaptations may become apparent to those skilled in the relevant arts in view of the foregoing description, when read in conjunction with the accompanying drawings and the appended claims. However, all such and similar modifications of the teachings of this invention will still fall within the scope of this invention as defined in the appended claims.

1. A method comprising: generating from a first audio signal, and via a first encoding and decoding of the first audio signal, a second audio signal; determining at least one energy difference value between the first audio signal and the second audio signal; and calculating at least one signal shaping factor dependent on the at least one energy difference value.
2. The method as claimed in claim 1, further comprising: partitioning the first audio signal into a plurality of segments.
3. The method as claimed in claim 2, wherein the segments are at least one of: time segments; frequency segments; time and frequency segments.
4. The method as claimed in claim 2, wherein calculating the at least one signal shaping factor comprises: comparing the at least one energy difference value for at least one of the plurality of segments of the second audio signal against a threshold value; and determining a value of the signal shaping factor associated with the at least one of the plurality of segments dependent on the result of the comparing the at least one energy difference value for at least one of the plurality of segments of the second audio signal against the threshold value.
5. The method as claimed in claim 2, wherein determining at least one energy difference value further comprises: determining at least two successive energy difference values for respective at least two successive segments of the first audio signal and at least two successive corresponding segments of the second audio signal.
6. The method as claimed in claim 5, wherein the calculating at least one signal shaping factor further comprises comparing the at least two energy difference values against a threshold in order to determine the signal shaping factor for at least one segment of the plurality of segments for the second audio signal.
7. The method as claimed in claim 2, the method further comprises generating a signal shaping factor control signal dependent on the signal shaping factor for each of the plurality of segments of the second audio signal.
8. The method as claimed in claim 2, wherein the energy difference value is dependent on the energy of at least one segment from the first audio signal and the energy of at least one segment from the second audio signal.
9. The method as claimed in claim 8, wherein the energy difference value is the ratio of the energy of at least one segment of the first audio signal to the energy of at least one segment of the second audio signal.
10. The method as claimed in claim 1, wherein the first audio signal is an unprocessed audio signal, and wherein the second audio signal is a synthetic audio signal.
11. The method as claimed in claim 1, wherein the first audio signal and the second audio signal are higher frequency audio signals.
12. A method comprising: receiving an encoded signal comprising at least in part a signal shaping factor signal; decoding the encoded signal to produce a synthetic audio signal; determining at least one signal shaping factor for the synthetic signal from the received gain factor signal; and applying the at least one signal shaping factor to the synthetic audio signal.
13. The method as claimed in claim 12, further comprising: partitioning the synthetic audio signal into a plurality of segments.
14. The method as claimed in claim 13, wherein the segment is at least one of: a time segment; a frequency segment; a time and frequency segment.
15. The method as claimed in claim 13, wherein the determining at least one signal shaping factor comprises determining at least one signal shaping factor for each one of the plurality of segments of the synthetic signal.
16. The method as claimed in claim 13, wherein applying the at least one signal shaping factor to the synthetic audio signal comprises applying the at least one signal shaping factor for each one of the plurality of segments to the synthetic audio signal.
17. The method as claimed in claim 12, wherein determining the at least one signal shaping factor function comprises: decoding at least one signal shaping factor from the signal shaping factor signal; adding the at least one signal shaping factor to a track of previous at least one signal shaping factor; and interpolating the at least one signal shaping factor with the at least one previous signal shaping factor from the track of signal shaping factors; and interpolating the previous signal shaping factor with the at least one signal shaping factor.
18-46. (canceled)
47. An apparatus comprising: at least one processor and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to: generate from a first audio signal, and via a first encoding and decoding of the first audio signal, a second audio signal; determine at least one energy difference value between the first audio signal and the second audio signal; and calculate at least one signal shaping factor dependent on the at least one energy difference value.
48. The apparatus as claimed in claim 47, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: partition the first audio signal into a plurality of segments.
49. The apparatus as claimed in claim 48, wherein the segments are at least one of: time segments; frequency segments; time and frequency segments.
50. The apparatus as claimed in claim 48, wherein the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to calculate at least one signal shaping factor is further configured to cause the apparatus at least to: compare the at least one energy difference value for at least one of the plurality of segments of the second audio signal against a threshold value; and determine a value of the signal shaping factor associated with the at least one of the plurality of segments dependent on the result of the comparison of the at least one energy difference value for at least one of the plurality of segments of the second audio signal against the threshold value.
51. The apparatus as claimed in claim 48, wherein the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to determine at least one energy difference value is further configured to cause the apparatus at least to: determine at least two successive energy difference values for respective at least two successive segments of the first audio signal and at least two successive corresponding segments of the second audio signal.
52. The apparatus as claimed in claim 51, wherein the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to calculate at least one signal shaping factor is further configured to cause the apparatus at least to: compare the at least two energy difference values against a threshold in order to determine the signal shaping factor for at least one segment of the plurality of segments for the second audio signal.
53. The apparatus as claimed in claim 48, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: generate a signal shaping factor control signal dependent on the signal shaping factor for each of the plurality of segments of the second audio signal.
54. The apparatus as claimed in claim 48, wherein the energy difference value is dependent on the energy of at least one segment from the first audio signal and the energy of at least one segment from the second audio signal.
55. The apparatus as claimed in claim 54, wherein the energy difference value is the ratio of the energy of at least one segment of the first audio signal to the energy of at least one segment of the second audio signal.
56. The apparatus as claimed in claim 47, wherein the first audio signal is an unprocessed audio signal, and wherein the second audio signal is a synthetic audio signal.
57. The apparatus as claimed in claim 47, wherein the first audio signal and the second audio signal are higher frequency audio signals.
58. An apparatus comprising: at least one processor and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to: receive an encoded signal comprising at least in part a signal shaping factor signal; decode the encoded signal to produce a synthetic audio signal; determine at least one signal shaping factor for the synthetic signal from the received signal shaping factor signal; and apply the at least one signal shaping factor to the synthetic audio signal.
59. The apparatus as claimed in claim 58, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: partition the synthetic audio signal into a plurality of segments.
60. The apparatus as claimed in claim 59, wherein the segment is at least one of: a time segment; a frequency segment; a time and frequency segment.
61. The apparatus as claimed in claim 59, wherein the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to determine at least one signal shaping factor is further configured to cause the apparatus at least to: determine at least one signal shaping factor for each one of the plurality of segments of the synthetic signal.
62. The apparatus as claimed in claim 59, wherein the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to apply the at least one signal shaping factor to the synthetic audio signal is further configured to cause the apparatus at least to: apply the at least one signal shaping factor for each one of the plurality of segments to the synthetic audio signal.
63. The apparatus as claimed in claim 58, wherein the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to determine the at least one signal shaping factor function is further configured to cause the apparatus at least to: decode at least one signal shaping factor from the signal shaping factor signal; add the at least one signal shaping factor to a track of previous at least one signal shaping factor; and interpolate the at least one signal shaping factor with the at least one previous signal shaping factor from the track of signal shaping factors; and interpolate the previous signal shaping factor with the at least one signal shaping factor.